AITopics | residual network

Auto-Compressing Networks

Neural Information Processing SystemsJun-20-2026, 20:21:08 GMT

Deep neural networks with short residual connections have demonstrated remarkable success across domains, but increasing depth often introduces computational redundancy without corresponding improvements in representation quality. We introduce Auto-Compressing Networks (ACNs), an architectural variant where additive long feedforward connections from each layer to the output replace traditional short residual connections. By analyzing the distinct dynamics induced by this modification, we reveal a unique property we coin as auto-compression--the ability of a network to organically compress information during training with gradient descent, through architectural design alone. Through auto-compression, information is dynamically "pushed" into early layers during training, enhancing their representational quality and revealing potential redundancy in deeper ones. We theoretically show that this property emerges from layer-wise training patterns present in ACNs, where layers are dynamically utilized during training based on task requirements. We also find that ACNs exhibit enhanced noise robustness compared to residual networks, superior performance in low-data settings, improved transfer learning capabilities, and mitigate catastrophic forgetting suggesting that they learn representations that generalize better despite using fewer parameters. Our results demonstrate up to 18% reduction in catastrophic forgetting and 30-80% architectural compression while maintaining accuracy across vision transformers, MLP-mixers, and BERT architectures. These findings establish ACNs as a practical approach to developing efficient neural architectures that automatically adapt their computational footprint to task complexity, while learning robust representations suitable for noisy real-world tasks and continual learning scenarios.

artificial intelligence, deep learning, machine learning, (18 more...)

Neural Information Processing Systems

Country:

Europe (1.00)
North America > United States (0.67)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Construction & Engineering (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

4c07fe24771249c343e70c32289c1192-Supplemental.pdf

Neural Information Processing SystemsApr-25-2026, 19:13:04 GMT

artificial intelligence, machine learning, transformation, (18 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

Add feedback

15de21c670ae7c3f6f3f1f37029303c9-Paper.pdf

Neural Information Processing SystemsApr-24-2026, 20:33:02 GMT

artificial intelligence, machine learning, pruning, (18 more...)

Neural Information Processing Systems

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Communications > Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

1757af1fe1429801bdf3abf5600f8bba-Paper-Conference.pdf

Neural Information Processing SystemsApr-24-2026, 19:55:29 GMT

artificial intelligence, machine learning, residual network, (16 more...)

Neural Information Processing Systems

Genre: Research Report (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Principles of Riemannian Geometry in Neural Networks

Michael Hauser, Asok Ray

Neural Information Processing SystemsApr-22-2026, 17:42:43 GMT

This study deals with neural networks in the sense of geometric transformations acting on the coordinate representation of the underlying data manifold which the data is sampled from. It forms part of an attempt to construct a formalized general theory of neural networks in the setting of Riemannian geometry. From this perspective, the following theoretical results are developed and proven for feedforward networks. First it is shown that residual neural networks are nite dierence approximations to dynamical systems of rst order dierential equations, as opposed to ordinary networks that are static. This implies that the network is learning systems of dierential equations governing the coordinate transformations that represent the data. Second it is shown that a closed form solution of the metric tensor on the underlying data manifold can be found by backpropagating the coordinate representations learned by the neural network itself. This is formulated in a formal abstract sense as a sequence of Lie group actions on the metric bre space in the principal and associated bundles on the data manifold. Toy experiments were run to conrm parts of the proposed theory, as well as to provide intuitions as to how neural networks operate on data.

artificial intelligence, machine learning, representation, (17 more...)

Neural Information Processing Systems

Country: North America > United States > Pennsylvania (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

Swapout: Learning an ensemble of deep architectures

Saurabh Singh, Derek Hoiem, David Forsyth

Neural Information Processing SystemsApr-22-2026, 01:55:21 GMT

We describe Swapout, a new stochastic training method, that outperforms ResNets of identical network structure yielding impressive results on CIFAR-10 and CIFAR100.

artificial intelligence, deep learning, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States > Illinois (0.14)

Genre: Research Report (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.84)

Add feedback

37bc2f75bf1bcfe8450a1a41c200364c-Paper.pdf

Neural Information Processing SystemsMar-23-2026, 06:20:22 GMT

artificial intelligence, machine learning, residual network, (19 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

The Reversible Residual Network: Backpropagation Without Storing Activations

Neural Information Processing SystemsMar-17-2026, 18:55:32 GMT

Residual Networks (ResNets) have demonstrated significant improvement over traditional Convolutional Neural Networks (CNNs) on image classification, increasing in performance as networks grow both deeper and wider. However, memory consumption becomes a bottleneck as one needs to store all the intermediate activations for calculating gradients using backpropagation. In this work, we present the Reversible Residual Network (RevNet), a variant of ResNets where each layer's activations can be reconstructed exactly from the next layer's. Therefore, the activations for most layers need not be stored in memory during backprop. We demonstrate the effectiveness of RevNets on CIFAR and ImageNet, establishing nearly identical performance to equally-sized ResNets, with activation storage requirements independent of depth.

artificial intelligence, machine learning, proceedings, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Mean Field Residual Networks: On the Edge of Chaos

Neural Information Processing SystemsMar-17-2026, 15:41:13 GMT

We study randomly initialized residual networks using mean field theory and the theory of difference equations.

artificial intelligence, machine learning, proceedings, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.35)

Add feedback

Spatiotemporal Residual Networks for Video Action Recognition

Neural Information Processing SystemsMar-17-2026, 07:55:36 GMT

Two-stream Convolutional Networks (ConvNets) have shown strong performance for human action recognition in videos. Recently, Residual Networks (ResNets) have arisen as a new technique to train extremely deep architectures. In this paper, we introduce spatiotemporal ResNets as a combination of these two approaches.

artificial intelligence, machine learning, proceedings, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.40)

Add feedback

Filters

Collaborating Authors

residual network

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Auto-Compressing Networks

4c07fe24771249c343e70c32289c1192-Supplemental.pdf

15de21c670ae7c3f6f3f1f37029303c9-Paper.pdf

1757af1fe1429801bdf3abf5600f8bba-Paper-Conference.pdf

Principles of Riemannian Geometry in Neural Networks

Swapout: Learning an ensemble of deep architectures

37bc2f75bf1bcfe8450a1a41c200364c-Paper.pdf

The Reversible Residual Network: Backpropagation Without Storing Activations

Mean Field Residual Networks: On the Edge of Chaos

Spatiotemporal Residual Networks for Video Action Recognition